ABSTRACT

We consider two-player games played for an infinite number
of rounds, with ω-regular winning conditions. The games
may be concurrent, in that the players choose their moves
simultaneously and independently, and probabilistic, in that
the moves determine a probability distribution for the
successor state. We introduce quantitative game μ-calculus, and
we show that the maximal probability of winning such games
can be expressed as a fixpoint formula in this calculus.
We develop the arguments both for deterministic and for
probabilistic concurrent games; as a special case, we solve
probabilistic turn-based games with ω-regular winning
conditions, which was also an open problem. We also characterize
the optimality, and the memory requirements, of the winning
strategies. In particular, we show that while memoryless strategies
suffice for winning games with safety and reachability
conditions, Büchi conditions require the use of strategies with
infinite memory. The existence of optimal strategies, as
opposed to ε-optimal strategies, is only guaranteed in games
with safety winning conditions.

1.  INTRODUCTION 

We consider two-player games played on finite state spaces
for an infinite number of rounds. In each round, depending
on the current state of the game, the moves of one or both
players determine the next state [25]; we consider games
in which the set of available moves is finite. Such games
offer a model for systems composed of interacting components,
and they have been studied under a wide range of
winning conditions. The winning conditions are often codified
by associating a reward with each state and choice of
moves, and by studying the maximal discounted, total, or
average reward that player 1 can obtain in such a game; for
surveys of algorithms for solving games with respect to such
winning conditions see, e.g., [24, 10]. Here, we consider winning
conditions consisting of ω-regular automata acceptance
conditions defined over the state space of the game [2, 11,
26]. Given a game with an ω-regular winning condition and
a starting state s, we study the maximal probability with
which player 1 can ensure that the condition holds from s;
we call this maximal probability the value of the game at s
for player 1. The determinacy result of [17] ensures that, at
all states and for all ω-regular winning conditions, the value
of the game for player 1 is equal to one minus the value of
the game with complementary condition for player 2.

We distinguish between turn-based and concurrent games,
and between deterministic and probabilistic games. Systems
in which the interaction between the components is asynchronous
give rise to turn-based games, where in each round
only one of the two players can choose among several moves.
On the other hand, synchronous interaction leads to concurrent
games, where in each round both players can choose
simultaneously and independently among several moves. The
games are deterministic if the current state and the moves
uniquely determine the successor state, and are probabilistic
if the current state and the moves determine a probability
distribution for the successor state. For any ω-regular winning
condition, the value of a deterministic turn-based game
at a state is either 0 or 1; moreover, player 1 can achieve this
value by playing according to a deterministic strategy, that
selects a move based on the current state and on the history
of the game [2, 11]. In contrast, the value of a concurrent
game at a state may be strictly between 0 and 1; furthermore,
achieving this value may require the use of randomized
strategies, that select not a move, but a probability
distribution over moves. To see this, consider the concurrent
game MatchOneBit. The game starts at state $s_0$, where
both players simultaneously and independently choose a bit
(0 or 1); if the bits match, the game proceeds to state $s_{win}$,
otherwise, it proceeds to state $s_{lose}$. Once at $s_{win}$ (resp.
$s_{lose}$) the game is confined there forever. Consider the safety
condition $\Box\{s_0, s_{win}\}$, requiring that $s_{lose}$ is not entered. For
every deterministic strategy of player 1, player 2 has another
(complementary) deterministic strategy that ensures a transition
to $s_{lose}$; hence, if player 1 could only use deterministic
strategies, he would win with probability 0. However, if
player 1 uses a randomized strategy that chooses each bit
with probability 1/2, then the game enters
state $s_{win}$ with probability 1/2, regardless of the strategy of
player 2; indeed, the value of the game at $s_0$ is 1/2.
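The MatchOneBit argument can be checked numerically. The following Python sketch is an illustration only: the function names and the grid of mixing probabilities are our own assumptions. It evaluates, for a strategy of player 1 that plays bit 1 with probability q, the winning probability guaranteed against both pure replies of player 2; checking pure replies suffices because the winning probability is linear in player 2's mixture.

```python
from fractions import Fraction

def win_probability(q: Fraction, bit2: int) -> Fraction:
    """Probability that the bits match when player 1 plays bit 1 with
    probability q and player 2 plays the fixed bit bit2."""
    return q if bit2 == 1 else 1 - q

def guaranteed(q: Fraction) -> Fraction:
    """Worst case over the two pure replies of player 2."""
    return min(win_probability(q, bit2) for bit2 in (0, 1))

# Deterministic strategies (always 0, always 1) guarantee nothing.
deterministic = [guaranteed(Fraction(0)), guaranteed(Fraction(1))]

# Search a grid of mixing probabilities (an arbitrary choice for illustration).
grid = [Fraction(k, 100) for k in range(101)]
best_q = max(grid, key=guaranteed)
print(deterministic, best_q, guaranteed(best_q))
```

On this grid the best guarantee is attained by the uniform strategy, in agreement with the value 1/2 stated above.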

The value of deterministic turn-based games with ω-regular
winning conditions can be computed with the algorithms
of [2, 11, 8, 26]. The algorithms of [8] are based
on the use of game μ-calculus, obtained by replacing the
predecessor operator Pre of classical μ-calculus [14] by the
controllable predecessor operator Cpre: for a set of states U,
the set Cpre(U) consists of the states from which player 1
can force the game into U in one step. A richer version of
game μ-calculus was used in [6] to provide qualitative solutions
for concurrent probabilistic games with ω-regular conditions.
There, multi-argument predecessor operators are
used to compute the set of states from which player 1 can
win with probability 1, or with probability arbitrarily close to 1.

We introduce quantitative game μ-calculus, and use it to
provide a uniform framework for understanding and solving
concurrent games with ω-regular winning conditions. In
quantitative game μ-calculus, sets of states are replaced by
functions from states to the interval [0, 1], and the controllable
predecessor operator Cpre is replaced by a quantitative
version Ppre. Given a function f from states to the interval
[0, 1], the function g = Ppre(f) associates with each
state the maximal expected value of f that player 1 can
ensure in one step. The operator Ppre can be evaluated
using results about matrix games [29, 23]. Related quantitative
predecessor operators for one-player structures were
considered in [13, 20, 12, 18]. We show that the values
of concurrent games with ω-regular conditions can be obtained
simply by replacing Cpre by Ppre in the solutions
of [8]. The result is surprising because concurrent games
differ from turn-based deterministic games in several fundamental
respects. First, concurrent games require in general
the use of randomized strategies, as remarked above.
Second, even for the simple winning condition of reachability,
optimal strategies may not exist: one can only guarantee
the existence of ε-optimal strategies for all ε > 0
[9]. Third, whereas finite-memory strategies suffice for winning
deterministic turn-based games, in concurrent games
both ε-optimal strategies, and optimal strategies if they exist,
may need an infinite amount of memory [6]. Fourth,
the standard recursive structure of proofs for deterministic
turn-based games [19, 26] breaks down, as both players can
choose a distribution over moves at each state.

We develop the arguments both for deterministic and for
probabilistic concurrent games. Hence, as a special case we
solve probabilistic turn-based games with ω-regular winning
conditions, which was also an open problem. The quantitative
game μ-calculus solution formulas also provide the value
of games with countable, rather than finite, state space. We
also characterize the optimality, and the memory requirements,
of the winning strategies. In particular, we show
that while memoryless strategies suffice for winning games
with safety and reachability conditions, Büchi and Rabin-chain
conditions require the use of strategies with infinite
memory. The existence of optimal strategies, as opposed to
ε-optimal strategies, is only guaranteed in games with safety
winning conditions.

As remarked by [8] in the context of deterministic turn-based
games, the use of μ-calculus for solving games helps
in the formulation of the correctness arguments. In order to
argue the correctness of a solution formula, we need to show
that player 1 has an optimal (or ε-optimal) strategy that
realizes the value given by the formula, and that player 2
has a "spoiling" strategy that is optimal (or ε-optimal) for
the game with the complementary condition. Since the operator
Ppre in the solution formula refers to player 1, an
optimal strategy for player 1 can be constructed from the
fixpoint of the formula. On the other hand, the derivation
of spoiling strategies for player 2 is not immediate: indeed,
even for games with safety or reachability conditions, the
standard argument involves the consideration of discounted
versions of the games (see, e.g., [10]). In contrast, by writing
the solution formula in game μ-calculus, we place the
burden of the argument on the syntactic complementation
of the solution formula. Specifically, for a winning condition
$\Psi$, we characterize the maximal probabilities of winning the
game by a μ-calculus formula $\phi$, and from $\phi$ we construct
an optimal (or ε-optimal) strategy for player 1. The syntactic
complement $\neg\phi$ of $\phi$ gives the maximal probabilities
for player 2 to win the dual game with condition $\neg\Psi$. From
$\neg\phi$, we can again construct an optimal (or ε-optimal) strategy
for player 2 for the game with condition $\neg\Psi$. The two
constructions are enough to conclude the correctness of our
solution formulas.

The iterative interpretation of quantitative game μ-calculus
leads to algorithms for the computation of approximate
solutions. By representing value functions symbolically,
these algorithms may be used for the approximate
analysis of games with very large state spaces [3, 7]. Unfortunately,
except for safety and reachability conditions, the
alternation of least and greatest fixpoint operators in the
solution formulas leads to approximation schemes that do not
converge monotonically to the value of a game. This situation
contrasts with the one for Markov decision processes,
where monotonically converging approximation schemes are
available, and where the maximal winning probability can
be computed in polynomial time by reduction to linear
programming [5]. We show that this discrepancy is no accident,
since the basic device for solving Markov decision processes
with ω-regular conditions, viz., a reduction to reachability,
fails for games.

2.  CONCURRENT  GAMES 

For a countable set $A$, a probability distribution on $A$ is a
function $p : A \to [0, 1]$ such that $\sum_{a \in A} p(a) = 1$. We denote
the set of probability distributions on $A$ by $\mathcal{D}(A)$. A (two-player)
concurrent game structure $\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$
consists of the following components:

•  A  finite  state  space  S. 

•  A  finite  set  Moves  of  moves. 

• Two move assignments $\Gamma_1, \Gamma_2 : S \to 2^{\mathit{Moves}} \setminus \emptyset$. For
$i \in \{1, 2\}$, assignment $\Gamma_i$ associates with each state
$s \in S$ the non-empty set $\Gamma_i(s) \subseteq \mathit{Moves}$ of moves
available to player $i$ at state $s$.

• A probabilistic transition function $p$, that gives the
probability $p(t \mid s, a_1, a_2)$ of a transition from $s$ to $t$ for
all $s, t \in S$ and all moves $a_1 \in \Gamma_1(s)$ and $a_2 \in \Gamma_2(s)$.

At every state $s \in S$, player 1 chooses a move $a_1 \in \Gamma_1(s)$,
and simultaneously and independently player 2 chooses a
move $a_2 \in \Gamma_2(s)$. The game then proceeds to the successor
state $t$ with probability $p(t \mid s, a_1, a_2)$, for all $t \in S$. We assume
that the players act non-cooperatively, i.e., each player
chooses her strategy independently and secretly from the
other player, and is only interested in maximizing her own
reward. A path of $\mathcal{G}$ is an infinite sequence $\bar{s} = s_0, s_1, s_2, \ldots$
of states in $S$ such that for all $k \ge 0$, there are moves
$a_1 \in \Gamma_1(s_k)$ and $a_2 \in \Gamma_2(s_k)$ with $p(s_{k+1} \mid s_k, a_1, a_2) > 0$.
We denote by $\Omega$ the set of all paths.

We  distinguish  the  following  special  classes  of  concurrent 
game  structures. 

• A concurrent game structure $\mathcal{G}$ is deterministic if for
all $s \in S$ and all $a_1 \in \Gamma_1(s)$, $a_2 \in \Gamma_2(s)$, there is a
$t \in S$ such that $p(t \mid s, a_1, a_2) = 1$.

• A concurrent game structure $\mathcal{G}$ is turn-based if at every
state at most one player can choose among multiple
moves; that is, if for every state $s \in S$ there exists at
most one $i \in \{1, 2\}$ with $|\Gamma_i(s)| > 1$.

For brevity, we refer to concurrent turn-based game structures
simply as turn-based game structures.
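To make the definitions concrete, the following Python sketch encodes a concurrent game structure and the two special classes. It is an illustration under our own assumptions: the class name `ConcurrentGame`, its field names, and the dictionary encoding of the transition function are hypothetical and are not used elsewhere in the paper.

```python
from dataclasses import dataclass

@dataclass
class ConcurrentGame:
    states: set   # finite state space S
    moves1: dict  # state -> non-empty set of moves of player 1
    moves2: dict  # state -> non-empty set of moves of player 2
    p: dict       # (s, a1, a2) -> {successor t: probability}

    def is_well_formed(self) -> bool:
        """Move sets are non-empty and every p(. | s, a1, a2) is a distribution on S."""
        for s in self.states:
            if not self.moves1[s] or not self.moves2[s]:
                return False
            for a1 in self.moves1[s]:
                for a2 in self.moves2[s]:
                    dist = self.p[(s, a1, a2)]
                    if any(t not in self.states or q < 0 for t, q in dist.items()):
                        return False
                    if abs(sum(dist.values()) - 1.0) > 1e-12:
                        return False
        return True

    def is_deterministic(self) -> bool:
        """Every choice of moves leads to a single successor with probability 1."""
        return all(any(q == 1 for q in dist.values()) for dist in self.p.values())

    def is_turn_based(self) -> bool:
        """At every state at most one player has more than one move."""
        return all(len(self.moves1[s]) == 1 or len(self.moves2[s]) == 1
                   for s in self.states)

# The MatchOneBit game of the introduction, with one dummy move at absorbing states.
match_one_bit = ConcurrentGame(
    states={"s0", "win", "lose"},
    moves1={"s0": {"0", "1"}, "win": {"0"}, "lose": {"0"}},
    moves2={"s0": {"0", "1"}, "win": {"0"}, "lose": {"0"}},
    p={
        ("s0", "0", "0"): {"win": 1.0}, ("s0", "1", "1"): {"win": 1.0},
        ("s0", "0", "1"): {"lose": 1.0}, ("s0", "1", "0"): {"lose": 1.0},
        ("win", "0", "0"): {"win": 1.0}, ("lose", "0", "0"): {"lose": 1.0},
    },
)
print(match_one_bit.is_deterministic(), match_one_bit.is_turn_based())
```

MatchOneBit is deterministic but not turn-based, since both players choose among two moves at its initial state.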

2.1  Randomized  strategies 

A strategy for player $i \in \{1, 2\}$ is a mapping $\pi_i : S^+ \to
\mathcal{D}(\mathit{Moves})$ that associates with every nonempty finite sequence
$\sigma \in S^+$ of states, representing the past history of
the game, a probability distribution $\pi_i(\sigma)$ used to select
the next move. Thus, the choice of the next move can be
history-dependent and randomized. The strategy $\pi_i$ can
prescribe only moves that are available to player $i$; that is,
for all sequences $\sigma \in S^*$ and states $s \in S$, we require that
$\pi_i(\sigma s)(a) > 0$ implies $a \in \Gamma_i(s)$. We denote by $\Pi_i$ the set of
all strategies for player $i \in \{1, 2\}$. A strategy $\pi$ is deterministic
if for all $\sigma \in S^+$ there exists $a \in \mathit{Moves}$ such that
$\pi(\sigma)(a) = 1$. Thus, deterministic strategies are equivalent
to functions $S^+ \to \mathit{Moves}$. A strategy $\pi$ is finite-memory if
the distribution chosen at every state $s \in S$ depends only on
$s$ itself, and on a finite number of bits of information about
the past history of the game. A strategy $\pi$ is memoryless if
$\pi(\sigma s) = \pi(s)$ for all $s \in S$ and all $\sigma \in S^*$.

Once the starting state $s$ and the strategies $\pi_1$ and $\pi_2$
for the two players have been chosen, the game is reduced
to an ordinary stochastic process. Hence, the probabilities
of events are uniquely defined, where an event $\mathcal{A} \subseteq \Omega$ is a
measurable set of paths.¹ For an event $\mathcal{A} \subseteq \Omega$, we denote by
$\Pr_s^{\pi_1,\pi_2}(\mathcal{A})$ the probability that a path belongs to $\mathcal{A}$ when
the game starts from $s$ and the players use the strategies
$\pi_1$ and $\pi_2$. Similarly, for a measurable function $f$ that associates
a number in $\mathbb{R} \cup \{\infty\}$ with each path, we denote
by $E_s^{\pi_1,\pi_2}\{f\}$ the expected value of $f$ when the game starts
from $s$ and the strategies $\pi_1$ and $\pi_2$ are used. We denote
by $\Theta_i$ the random variable representing the $i$-th state of a
path; formally, $\Theta_i$ is a variable that assumes value $s_i$ on the
path $s_0, s_1, s_2, \ldots$.

2.2  Winning  conditions 

Given a concurrent game structure $\mathcal{G} =
\langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$, we consider winning conditions
expressed by linear-time temporal logic (LTL) formulas,
whose atomic propositions correspond to subsets of the
set $S$ of states [16]. We focus on winning conditions that
correspond to safety or reachability properties, as well as
winning conditions that correspond to the accepting criteria
of Büchi, co-Büchi, and Rabin-chain automata [21, 8]. We
call games with such winning conditions safety, reachability,
Büchi, co-Büchi, and Rabin-chain games, respectively. The
ability to solve games with Rabin-chain conditions suffices
for solving games with arbitrary LTL (or ω-regular) winning
conditions: it suffices to encode the ω-regular condition
as a deterministic Rabin-chain automaton, and then to
solve the game consisting of the synchronous product of the
original game with the Rabin-chain automaton [21, 26].

¹ To be precise, we should define events as measurable sets of
paths sharing the same initial state. However, our (slightly)
improper definition leads to more concise notation.

Given an LTL winning condition $\Psi$, by abuse of notation
we denote also by $\Psi$ the set of paths $\bar{s} \in \Omega$ that satisfy
$\Psi$; this set is measurable for any choice of strategies for the
two players [28]. Hence, the probability that a path satisfies
$\Psi$ starting from state $s \in S$ under strategies $\pi_1, \pi_2$ for the
two players is $\Pr_s^{\pi_1,\pi_2}(\Psi)$. Given a state $s \in S$ and a winning
condition $\Psi$, we are interested in finding the maximal
probability with which player $i \in \{1, 2\}$ can ensure that $\Psi$
holds from $s$. We call this probability the value of the game
$\Psi$ at $s$ for player $i \in \{1, 2\}$. This value for player 1 is given
by the function $\langle 1 \rangle \Psi : S \to [0, 1]$, defined for all $s \in S$ by

$$\langle 1 \rangle \Psi(s) = \sup_{\pi_1 \in \Pi_1} \inf_{\pi_2 \in \Pi_2} \Pr\nolimits_s^{\pi_1,\pi_2}(\Psi).$$

The value for player 2 is given by the function $\langle 2 \rangle \Psi$, defined
symmetrically. Concurrent games satisfy a quantitative version
of determinacy [17], stating that for all LTL conditions
$\Psi$ and all $s \in S$, we have

$$\langle 1 \rangle \Psi(s) = 1 - \langle 2 \rangle \neg\Psi(s).$$

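A strategy $\pi_1$ for player 1 is optimal if for all $s \in S$ we have

$$\inf_{\pi_2 \in \Pi_2} \Pr\nolimits_s^{\pi_1,\pi_2}(\Psi) = \langle 1 \rangle \Psi(s).$$

For $\varepsilon > 0$, a strategy $\pi_1$ for player 1 is ε-optimal if for all
$s \in S$ we have

$$\inf_{\pi_2 \in \Pi_2} \Pr\nolimits_s^{\pi_1,\pi_2}(\Psi) \ge \langle 1 \rangle \Psi(s) - \varepsilon.$$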

Note that the quantitative determinacy of concurrent games
is equivalent to the existence of ε-optimal strategies for all
$\varepsilon > 0$ at all states $s \in S$. For the special case of deterministic
turn-based games, it is known that the value of any
ω-regular game at a state is either 0 or 1, and finite-memory
deterministic optimal strategies always exist; the value of
the game can be computed with the algorithms of [2, 11, 8].

2.3  Predecessor  operators 

Let $\mathcal{F}$ be the space of all functions $S \to [0, 1]$ that map
states into the interval [0, 1]. Given two functions $f, g \in \mathcal{F}$,
we write $f \ge g$ (resp. $f > g$) if $f(s) \ge g(s)$ (resp. $f(s) >
g(s)$) at all $s \in S$, and we define $f \wedge g$ and $f \vee g$ by

$$(f \wedge g)(s) = \min\{f(s), g(s)\}$$

$$(f \vee g)(s) = \max\{f(s), g(s)\}$$

for all $s \in S$. We denote by $\mathbf{0}$ and $\mathbf{1}$ the constant functions
that map all states into 0 and 1, respectively. For all $f \in \mathcal{F}$,
we denote by $1 - f$ the function defined by $(1 - f)(s) =
1 - f(s)$ for all $s \in S$. Given a subset $Q \subseteq S$ of states, by
abuse of notation we denote also by $Q$ the indicator function
of $Q$, defined by $Q(s) = 1$ if $s \in Q$ and $Q(s) = 0$ otherwise.
We denote by $\neg Q = S \setminus Q$ the complement of the subset $Q$ in
$S$, and again we denote also by $\neg Q$ the indicator function
of $\neg Q$. We denote by $\mathcal{F}_I \subseteq \mathcal{F}$ the set of indicator functions.
The quantitative predecessor operators $\mathrm{Ppre}_1, \mathrm{Ppre}_2 : \mathcal{F} \to
\mathcal{F}$ are defined for every $f \in \mathcal{F}$ by

$$\mathrm{Ppre}_1(f)(s) = \sup_{\pi_1 \in \Pi_1} \inf_{\pi_2 \in \Pi_2} E_s^{\pi_1,\pi_2}\{f(\Theta_1)\}$$

and symmetrically for $\mathrm{Ppre}_2$. Intuitively, the value
$\mathrm{Ppre}_i(f)(s)$ is the maximum expectation for the next value
of $f$ that player $i \in \{1, 2\}$ can achieve. Given $f \in \mathcal{F}$ and
$i \in \{1, 2\}$, the function $\mathrm{Ppre}_i(f)$ can be computed by solving
the following matrix game at each $s \in S$:

$$\mathrm{Ppre}_i(f)(s) = \mathrm{val}_i \Bigl[ \sum_{t \in S} f(t)\, p(t \mid s, a_1, a_2) \Bigr]_{a_1 \in \Gamma_1(s),\, a_2 \in \Gamma_2(s)}$$

The existence of solutions to the above matrix games, and
the existence of optimal randomized strategies for players
1 and 2, is guaranteed by the minmax theorem [29]. The
matrix games may be solved using traditional linear programming
algorithms (see, e.g., [23]). From properties of
matrix games we have the following facts. For $i \in \{1, 2\}$, the
operator $\mathrm{Ppre}_i$ is monotonic and continuous, that is, for all
$f, g \in \mathcal{F}$, if $f \ge g$ then $\mathrm{Ppre}_i(f) \ge \mathrm{Ppre}_i(g)$; and for all $f_1 \le
f_2 \le \cdots$ in $\mathcal{F}$, we have $\lim_n \mathrm{Ppre}_i(f_n) = \mathrm{Ppre}_i(\lim_n f_n)$.
Moreover, the operators $\mathrm{Ppre}_1$ and $\mathrm{Ppre}_2$ are dual: for all
$f \in \mathcal{F}$, we have $\mathrm{Ppre}_1(f) = 1 - \mathrm{Ppre}_2(1 - f)$.
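The evaluation of the predecessor operators through matrix games, and their duality, can be illustrated on a single state. The Python sketch below makes several assumptions of our own: it handles only states where both players have exactly two moves, for which the matrix game has a closed-form value (a general implementation would use linear programming as in [23]); the helper names `value_2x2`, `ppre1`, and `ppre2` are hypothetical; and the transition probabilities at state `t` are an invented example.

```python
from fractions import Fraction as Fr

def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (a, b), (c, d) = m
    lower = max(min(a, b), min(c, d))  # best pure guarantee of the row player
    upper = min(max(a, c), max(b, d))  # best pure guarantee of the column player
    if lower == upper:                 # saddle point in pure strategies
        return lower
    # Without a saddle point both players mix and the value has a closed form.
    return (a * d - b * c) / (a + d - b - c)

def expect(f, dist):
    """Expected value of f under a successor distribution."""
    return sum(q * f[t] for t, q in dist.items())

def ppre1(f, p, s, moves1, moves2):
    """Player 1 maximizes: her moves index the rows."""
    return value_2x2([[expect(f, p[(s, a1, a2)]) for a2 in moves2] for a1 in moves1])

def ppre2(f, p, s, moves1, moves2):
    """Player 2 maximizes: her moves index the rows."""
    return value_2x2([[expect(f, p[(s, a1, a2)]) for a1 in moves1] for a2 in moves2])

# Invented transition probabilities at a state "t", keyed by (state, a1, a2).
p = {
    ("t", "a", "a"): {"u": Fr(1)},
    ("t", "a", "b"): {"s": Fr(1)},
    ("t", "b", "a"): {"s": Fr(1)},
    ("t", "b", "b"): {"u": Fr(1, 2), "t": Fr(1, 2)},
}
f = {"s": Fr(0), "t": Fr(1, 2), "u": Fr(1)}
one_minus_f = {t: 1 - v for t, v in f.items()}
moves = ("a", "b")

v1 = ppre1(f, p, "t", moves, moves)
v2 = ppre2(one_minus_f, p, "t", moves, moves)
print(v1, 1 - v2)
```

Both printed numbers coincide, which is the duality $\mathrm{Ppre}_1(f) = 1 - \mathrm{Ppre}_2(1 - f)$ at this state.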

2.4 Quantitative game μ-calculus

We write the solutions of games with respect to ω-regular
winning conditions in quantitative game μ-calculus. The formulas
of the quantitative game μ-calculus are generated by
the grammar

$$\phi ::= Q \mid x \mid \phi \vee \phi \mid \phi \wedge \phi \mid \mathrm{Ppre}_1(\phi) \mid \mathrm{Ppre}_2(\phi) \mid \mu x.\phi \mid \nu x.\phi \qquad (1)$$

for propositions $Q \subseteq S$ and variables $x$ from some fixed
set $X$. Hence, as for LTL, the propositions of quantitative
μ-calculus formulas correspond to subsets of states of
the game. As usual, a formula $\phi$ is closed if every variable
$x$ in $\phi$ occurs in the scope of a fixpoint quantifier $\mu x$ or $\nu x$.

Let $\mathcal{E} : X \to \mathcal{F}$ be a variable valuation that associates
a function $\mathcal{E}(x) \in \mathcal{F}$ with each variable $x \in X$. We write
$\mathcal{E}[x \mapsto f]$ for the valuation that agrees with $\mathcal{E}$ on all variables,
except that $x \in X$ is mapped to $f \in \mathcal{F}$. Given a valuation
$\mathcal{E}$, every formula $\phi$ of quantitative game μ-calculus
defines a function $[\![\phi]\!]_{\mathcal{E}} \in \mathcal{F}$:


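$$[\![Q]\!]_{\mathcal{E}} = Q$$

$$[\![x]\!]_{\mathcal{E}} = \mathcal{E}(x)$$

$$[\![\mathrm{Ppre}_1(\phi)]\!]_{\mathcal{E}} = \mathrm{Ppre}_1([\![\phi]\!]_{\mathcal{E}})$$

$$[\![\mathrm{Ppre}_2(\phi)]\!]_{\mathcal{E}} = \mathrm{Ppre}_2([\![\phi]\!]_{\mathcal{E}})$$

$$[\![\phi_1 \,\{\vee, \wedge\}\, \phi_2]\!]_{\mathcal{E}} = [\![\phi_1]\!]_{\mathcal{E}} \,\{\vee, \wedge\}\, [\![\phi_2]\!]_{\mathcal{E}}$$

$$[\![\{\mu, \nu\}\, x.\phi]\!]_{\mathcal{E}} = \{\inf, \sup\} \bigl\{ f \in \mathcal{F} \mid f = [\![\phi]\!]_{\mathcal{E}[x \mapsto f]} \bigr\}$$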

The existence and uniqueness of the above fixpoints for the
$\mu$ and $\nu$ operators is a consequence of the monotonicity and
continuity of all the operators, and in particular of $\mathrm{Ppre}_1$ and
$\mathrm{Ppre}_2$. As usual, the fixpoints can be evaluated in an iterative
fashion: we have $[\![\mu x.\phi]\!]_{\mathcal{E}} = \lim_{n \to \infty} x_n$, where $x_0 = \mathbf{0}$,
and $x_{n+1} = [\![\phi]\!]_{\mathcal{E}[x \mapsto x_n]}$ for $n \ge 0$. Similarly, for the greatest
fixpoint operator $\nu$ we have $[\![\nu x.\phi]\!]_{\mathcal{E}} = \lim_{n \to \infty} x_n$, where
$x_0 = \mathbf{1}$, and $x_{n+1} = [\![\phi]\!]_{\mathcal{E}[x \mapsto x_n]}$ for $n \ge 0$. A quantitative
game μ-calculus formula suggests a way to implement approximation
algorithms for large state spaces, using a subset
$\mathcal{F}' \subseteq \mathcal{F}$ of base functions that have compact representations
[1, 4, 7]. We note that the solution algorithms presented in
this paper apply also to games with countable (rather than
finite) state space and finite set of moves (see Theorem 4); in
this case, however, the iterative evaluation of the fixpoints
needs to be based on transfinite induction.
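The iterative evaluation of a single fixpoint can be sketched in a few lines of Python. The sketch rests on assumptions of our own: value functions are dictionaries over a finite state space, iteration stops when successive iterates differ by at most a tolerance (so the result is in general only an approximation of the limit), and the three-state Markov chain, in which neither player has a choice so that $\mathrm{Ppre}_1$ reduces to an expectation, is an invented example.

```python
def iterate(op, start, tol=1e-12, max_steps=100_000):
    """Apply op repeatedly from start until successive iterates are within tol."""
    x = dict(start)
    for _ in range(max_steps):
        y = op(x)
        if max(abs(y[s] - x[s]) for s in x) <= tol:
            return y
        x = y
    return x

def lfp(op, states, **kw):
    """Least fixpoint approximation: iterate from the constant function 0."""
    return iterate(op, {s: 0.0 for s in states}, **kw)

def gfp(op, states, **kw):
    """Greatest fixpoint approximation: iterate from the constant function 1."""
    return iterate(op, {s: 1.0 for s in states}, **kw)

# Invented chain: "goal" and "loop" are absorbing, "coin" moves to either
# of them with probability 1/2.
CHAIN = {
    "goal": {"goal": 1.0},
    "loop": {"loop": 1.0},
    "coin": {"goal": 0.5, "loop": 0.5},
}
U = {"goal"}

def reach_op(x):
    """The operator x -> U v Ppre(x) on this chain."""
    return {s: max(1.0 if s in U else 0.0,
                   sum(q * x[t] for t, q in CHAIN[s].items()))
            for s in CHAIN}

least = lfp(reach_op, CHAIN)
greatest = gfp(reach_op, CHAIN)
print(least, greatest)
```

The two fixpoints differ at `loop` and `coin`: the least fixpoint assigns them the probability of reaching `goal`, whereas the greatest fixpoint assigns 1 to every state.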

The quantitative game μ-calculus defined by (1) suffices
for writing the solution formulas of games with ω-regular
winning conditions. In some intermediate lemmas, however,
we use with slight abuse of notation an extended version
of the calculus, in which we have one symbol $f$ for every
function $f \in \mathcal{F}$. Obviously, such functions are interpreted
as themselves: for all valuations $\mathcal{E}$, we have $[\![f]\!]_{\mathcal{E}} = f$.

2.5  Complementation  and  correctness 

We solve concurrent games with LTL winning condition
$\Psi$ by providing a quantitative game μ-calculus formula $\phi$
such that $\langle 1 \rangle \Psi = [\![\phi]\!]$. To prove this equality, we exploit
the complementation of μ-calculus expressions. The complement
of a closed μ-calculus formula $\phi$ is a formula $\neg\phi$
such that $1 - [\![\phi]\!] = [\![\neg\phi]\!]$; the complement can be obtained
by recursively applying the following transformations, which
rely on the duality of $\mathrm{Ppre}_1$ and $\mathrm{Ppre}_2$:

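$$\neg Q \Rightarrow S \setminus Q \qquad\qquad \neg\neg x \Rightarrow x$$

$$\neg(\phi_1 \vee \phi_2) \Rightarrow \neg\phi_1 \wedge \neg\phi_2 \qquad\qquad \neg(\phi_1 \wedge \phi_2) \Rightarrow \neg\phi_1 \vee \neg\phi_2$$

$$\neg(\mathrm{Ppre}_1(\phi)) \Rightarrow \mathrm{Ppre}_2(\neg\phi) \qquad\qquad \neg(\mathrm{Ppre}_2(\phi)) \Rightarrow \mathrm{Ppre}_1(\neg\phi)$$

$$\neg(\mu x.\phi) \Rightarrow \nu x.\neg\phi[\neg x / x] \qquad\qquad \neg(\nu x.\phi) \Rightarrow \mu x.\neg\phi[\neg x / x]$$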

where $\phi[\neg x / x]$ denotes the result of replacing $x$ with $\neg x$ in
$\phi$. Note that since the formula $\phi$ is closed, by applying the
above transformations to $\neg\phi$ we obtain again a formula of
the syntactic form (1). In fact, the above transformations
push the $\neg$ operator to the leaves of the syntax tree (1),
which consist either of subsets $Q \subseteq S$ or of variables $x \in X$.
The subsets are simply complemented. Since $\phi$ is closed,
each variable $x \in X$ in $\phi$ appears in the scope of a $\mu x$ or
$\nu x$ quantifier; the transformation rules for $\mu$ and $\nu$, together
with the rule for double negation elimination, ensure that
once all transformations have been applied, no $\neg$ operator
remains as prefix to a variable.
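The syntactic complementation of a closed formula can be sketched as a recursive function on syntax trees. The tuple encoding of formulas and the function name `complement` are assumptions of this illustration. The sketch applies the dualities described above (propositions are complemented, $\vee$ and $\wedge$ are exchanged, $\mathrm{Ppre}_1$ and $\mathrm{Ppre}_2$ are exchanged, $\mu$ and $\nu$ are exchanged); bound variables are returned unchanged, because the substitution of $\neg x$ for $x$ under a fixpoint quantifier cancels against the negation pushed down to the variable.

```python
def complement(phi, states):
    """Syntactic complement of a closed formula given as nested tuples:
    ("prop", Q), ("var", x), ("or", f, g), ("and", f, g),
    ("ppre", i, f), ("mu", x, f), ("nu", x, f)."""
    tag = phi[0]
    if tag == "prop":
        return ("prop", frozenset(states) - phi[1])
    if tag == "var":
        # The substitution [not x / x] and the pushed-down negation cancel.
        return phi
    if tag == "or":
        return ("and", complement(phi[1], states), complement(phi[2], states))
    if tag == "and":
        return ("or", complement(phi[1], states), complement(phi[2], states))
    if tag == "ppre":
        return ("ppre", 3 - phi[1], complement(phi[2], states))  # swap players 1 and 2
    if tag == "mu":
        return ("nu", phi[1], complement(phi[2], states))
    if tag == "nu":
        return ("mu", phi[1], complement(phi[2], states))
    raise ValueError(f"unknown formula tag: {tag!r}")

S = {"s", "t", "u"}
# mu x.(U v Ppre_1(x)) with U = {u}
reach = ("mu", "x", ("or", ("prop", frozenset({"u"})), ("ppre", 1, ("var", "x"))))
print(complement(reach, S))
```

On the reachability formula $\mu x.(U \vee \mathrm{Ppre}_1(x))$ the function returns the encoding of $\nu x.(\neg U \wedge \mathrm{Ppre}_2(x))$, and applying it twice returns the original formula.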

Our proofs of $\langle 1 \rangle \Psi = [\![\phi]\!]$ consist of two steps.

• First, from $\phi$ we construct for all $\varepsilon > 0$ a strategy $\pi_1^\varepsilon$
for player 1 that ensures winning with probability at
least $[\![\phi]\!] - \varepsilon$, proving $[\![\phi]\!] \le \langle 1 \rangle \Psi$.

• Second, we complement $\phi$, and we consider the winning
condition $\neg\Psi$. From $\neg\phi$ we construct for all $\varepsilon > 0$
a strategy $\pi_2^\varepsilon$ that enables player 2 to win the game
with goal $\neg\Psi$ with probability at least $[\![\neg\phi]\!] - \varepsilon$; this
shows $[\![\neg\phi]\!] \le \langle 2 \rangle \neg\Psi$, or equivalently $[\![\phi]\!] \ge \langle 1 \rangle \Psi$.

Even in the cases where solution formulas for concurrent
games are known, such as for the reachability winning condition
(see [10], Chapter 4.4), this approach yields simpler
arguments than the classical one, where the ε-optimal strategies
for both players have to be constructed from the solution
formula $\phi$ for player 1 alone, and where it is usually
necessary to consider discounted versions of the games.

3.  REACHABILITY  AND  SAFETY  GAMES 

Concurrent reachability and safety games can be solved
by reducing them to positive stochastic games [27, 10]. We
present the solution algorithms, reformulating them in quantitative
game μ-calculus. As mentioned above, by relying on
the complementation of quantitative game μ-calculus, we are
able to prove the correctness of the solutions without resorting
to the consideration of discounted versions of the same
games.

A concurrent reachability game consists of a concurrent
game structure $\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$ together with a
winning condition $\Diamond U$, where $U \subseteq S$. Intuitively, the winning
condition consists of reaching the subset $U$ of states.
The solution of such a reachability game is given by

$$\langle 1 \rangle \Diamond U = [\![\mu x.(U \vee \mathrm{Ppre}_1(x))]\!]. \qquad (2)$$

This solution can be computed iteratively as the limit
$\langle 1 \rangle \Diamond U = \lim_{k \to \infty} x_k$, where $x_0 = \mathbf{0}$ and $x_{k+1} = U \vee
\mathrm{Ppre}_1(x_k)$ for $k \ge 0$. This iteration gives an approximation
scheme for solving the reachability game. In Markov
decision processes, one can reduce the reachability question
to a linear programming problem which can then be solved
exactly; this gives an alternative to value iteration. Unfortunately,
for concurrent games we cannot reduce the problem
to linear programming, because the maximal probability
of winning in a game where all probabilities are rational
may still be irrational (see, e.g., [24]).

Example 1. Consider a concurrent game with three
states $s$, $t$, and $u$, and winning condition $\Diamond\{u\}$. The transition
relation is as follows: from state $t$, player 1 has
two choices $a_1$ and $b_1$, and player 2 has the choices $a_2$ and
$b_2$. The transition probabilities are: $\Pr(s \mid t, a_1, a_2) =
\Pr(t \mid t, a_1, a_2) = \Pr(u \mid t, b_1, a_2) = \Pr(u \mid t, a_1, b_2) = 0$,
$\Pr(s \mid t, b_1, a_2) = \Pr(s \mid t, a_1, b_2) = 1$, $\Pr(u \mid t, b_1, b_2) = \frac{3}{4}$, and
$\Pr(t \mid t, b_1, b_2) = \frac{1}{4}$. The states $s$ and $u$ are absorbing: the
game never leaves $s$ or $u$ once it reaches these states. The
maximal probability of winning the game $\Diamond\{u\}$ is given by
the least fixpoint of $x = \mathrm{Ppre}_1(x) \vee \{u\}$; for state $t$, we have
$x(t) = -3 + 2\sqrt{3}$. ■
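The value iteration for the game of Example 1 can be sketched in Python. The sketch relies on assumptions that we state explicitly: the transition probabilities at $t$ are read as leading to $u$ with probability 1 under $(a_1, a_2)$, to $s$ with probability 1 under the two mismatched pairs, and to $u$ with probability 3/4 and back to $t$ with probability 1/4 under $(b_1, b_2)$; the helper `value_2xn` is hypothetical and solves matrix games in which the maximizing player has exactly two moves, by inspecting the endpoints and breakpoints of the lower envelope of her guaranteed payoff.

```python
import math

def value_2xn(rows):
    """Value of a zero-sum matrix game with two rows (row player maximizes)."""
    top, bottom = rows

    def guaranteed(q):
        # Row 0 is played with probability q; the column player replies optimally.
        return min(q * a + (1 - q) * b for a, b in zip(top, bottom))

    # The guaranteed payoff is concave and piecewise linear in q, so its maximum
    # is attained at an endpoint or where two of the column lines cross.
    candidates = [0.0, 1.0]
    n = len(top)
    for i in range(n):
        for j in range(i + 1, n):
            denom = (top[i] - bottom[i]) - (top[j] - bottom[j])
            if denom != 0:
                q = (bottom[j] - bottom[i]) / denom
                if 0 <= q <= 1:
                    candidates.append(q)
    return max(guaranteed(q) for q in candidates)

# Assumed reading of the transition probabilities at state t.
P_T = {
    ("a1", "a2"): {"u": 1.0},
    ("a1", "b2"): {"s": 1.0},
    ("b1", "a2"): {"s": 1.0},
    ("b1", "b2"): {"u": 0.75, "t": 0.25},
}

def step(x):
    """One application of x -> {u} v Ppre_1(x)."""
    rows = [[sum(q * x[r] for r, q in P_T[(a1, a2)].items())
             for a2 in ("a2", "b2")]
            for a1 in ("a1", "b1")]
    # u lies in the target set; s is absorbing and outside it, so x(s) stays 0.
    return {"s": 0.0, "u": 1.0, "t": value_2xn(rows)}

x = {"s": 0.0, "t": 0.0, "u": 0.0}
for _ in range(100):
    x = step(x)
print(x["t"], -3 + 2 * math.sqrt(3))
```

Under these assumptions the iterates at $t$ increase towards the root of $x^2 + 6x - 3 = 0$ in $[0, 1]$, which is irrational even though all transition probabilities are rational.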

To prove (2), we show separately the two inequalities

$$\langle 1 \rangle \Diamond U \ge [\![\mu x.(U \vee \mathrm{Ppre}_1(x))]\!]$$

$$\langle 1 \rangle \Diamond U \le [\![\mu x.(U \vee \mathrm{Ppre}_1(x))]\!].$$

The first inequality is a consequence of the following lemma;
the second inequality, as mentioned in Section 2.5, will follow
from results on safety games.

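Lemma 1. Let $w = [\![\mu x.(U \vee \mathrm{Ppre}_1(x))]\!]$. For all $\varepsilon > 0$
player 1 has a strategy $\pi_1^\varepsilon$ such that $\Pr_s^{\pi_1^\varepsilon,\pi_2}(\Diamond U) \ge w(s) - \varepsilon$
for all $\pi_2 \in \Pi_2$ and all $s \in S$.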

The proof follows a classical argument (see, e.g., [9, 10]).
For $n \ge 0$, consider the $n$-step version of the game, whose
winning condition $\Diamond_n U$ requires reaching $U$ in at most $n$
steps. Let also $x_0 = \mathbf{0}$ and $x_{n+1} = U \vee \mathrm{Ppre}_1(x_n)$ for $n \ge 0$.
By induction on $n$, we can show that $\langle 1 \rangle \Diamond_n U \ge x_n$ for all
$n \ge 0$. The result then follows from $w = \lim_{n \to \infty} x_n$, and
from the fact that $\Diamond_n U$ implies $\Diamond U$ for all $n \ge 0$.

A concurrent safety game consists of a concurrent game
structure $\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$ together with a winning
condition $\Box U$, where $U \subseteq S$. Intuitively, the winning condition
consists of staying forever in the subset $U$ of states.
The complement of the reachability condition $\Diamond U$ is the
safety condition $\Box \neg U$, and the complement of the quantitative
game μ-calculus formula $\mu x.(U \vee \mathrm{Ppre}_1(x))$ is

$$\nu x.(\neg U \wedge \mathrm{Ppre}_2(x)),$$

where $\neg U$ is an abbreviation for $S \setminus U$. We will show that
the solution of concurrent safety games is given by

$$\langle 1 \rangle \Box U = [\![\nu x.(U \wedge \mathrm{Ppre}_1(x))]\!], \qquad (3)$$

which is dual to (2). To this end, we prove the following
lemma.

Lemma 2. Let $w = [\![\nu x.(U \wedge \mathrm{Ppre}_1(x))]\!]$. Player 1 has
a strategy $\pi_1$ such that $\Pr_s^{\pi_1,\pi_2}(\Box U) \ge w(s)$ for all $\pi_2 \in \Pi_2$
and all $s \in S$.

The lemma can be proved using standard arguments about
positive reward games [10]. We present here a more direct
proof, that will lead to the arguments for Büchi and co-Büchi
games.

Proof. Let $\pi_1$ be a strategy for player 1 that at all $s \in U$
plays according to an optimal distribution of the matrix
game corresponding to $\mathrm{Ppre}_1(w)(s)$, and at all $s \in S \setminus U$
plays arbitrarily. Fix a state $s_0 \in S$ and an arbitrary strategy
$\pi_2 \in \Pi_2$. The process $\{H_n\}_{n \geq 0}$ defined by $H_n = w(\Theta_n)$
is a submartingale [30]: in fact, from $w(s) = \mathrm{Ppre}_1(w)(s)$
for $s \in U$ and from the choice of $\pi_1$ it follows that

$$E_{s_0}^{\pi_1,\pi_2}\{H_{n+1} \mid H_0, H_1, \ldots, H_n\} \geq H_n$$

for all $n \geq 0$. Hence, we have $E_{s_0}^{\pi_1,\pi_2}\{H_n\} \geq H_0 = w(s_0)$.
Moreover, since $w(s) \leq 1$ at all $s \in S$, by inspection we
have $E_{s_0}^{\pi_1,\pi_2}\{H_n\} \leq \Pr_{s_0}^{\pi_1,\pi_2}(\Box_n U)$, where $\Box_n U$ is the event
of staying in $U$ for at least $n$ steps. Combining these two
inequalities we obtain $w(s_0) \leq \Pr_{s_0}^{\pi_1,\pi_2}(\Box_n U)$, and the result
follows from $\Pr_{s_0}^{\pi_1,\pi_2}(\Box U) = \lim_{n \to \infty} \Pr_{s_0}^{\pi_1,\pi_2}(\Box_n U)$. ■

The  following  theorem  summarizes  the  properties  of  con¬ 
current  reachability  and  safety  games. 

Theorem  1.  The  following  assertions  hold. 

1. Concurrent reachability and safety games can be solved
according to (2) and (3).

2. Concurrent reachability games have memoryless
$\varepsilon$-optimal strategies; there are deterministic concurrent
reachability games without optimal strategies.
Turn-based reachability games always have deterministic and
memoryless optimal strategies.

3. Concurrent safety games have memoryless optimal
strategies; there are deterministic concurrent
safety games without memoryless deterministic optimal
strategies. Turn-based safety games always have
deterministic and memoryless optimal strategies.

Part 1 is classical [9, 10], except for the notation; the result
also follows from the combination of Lemmas 1 and 2. The
existence of memoryless $\varepsilon$-optimal strategies for concurrent
reachability games follows from [22]. The existence of
deterministic concurrent reachability games without optimal
strategies is demonstrated by Example 2 below, adapted
from [9, 15]. The existence of optimal strategies for
concurrent safety games is classical; it also follows from the
proof of Lemma 2. The existence of deterministic concurrent
safety games without optimal deterministic strategies
is demonstrated by the game MatchOneBit described in
the introduction. The results for turn-based games follow
from results on perfect-information games; see, e.g., [10].

Example 2. Consider a concurrent game with $S = \{s, t, u\}$;
the only state where players can choose among
more than one move is $s$. We have $\Gamma_1(s) = \{a, b\}$ and
$\Gamma_2(s) = \{c, d\}$. The game has a deterministic transition
function: $p(s \mid s, a, c) = p(t \mid s, a, d) = p(t \mid s, b, c) = p(u \mid s, b, d) = 1$,
and all other transition probabilities are 0. We have
$\langle 1 \rangle \Diamond \{t\}(s) = 1$. In fact, for every $\varepsilon > 0$, player 1 can play
moves $a$ and $b$ with probability $1 - \varepsilon$ and $\varepsilon$ respectively to ensure
a winning probability of $1 - \varepsilon$ from $s$. However, player 1
has no optimal strategy: if he decides to play move $b$ at the
$n$-th round, player 2 can play move $d$ at the $n$-th round, so
that the probability of reaching $t$ is always less than 1. ■
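The $1 - \varepsilon$ guarantee in Example 2 can be checked round by round. The computation below is a sketch: the symbol $q$ (the probability, given the history so far, with which player 2 plays $d$ in the current round) is introduced here for illustration. While the game is in $s$ and player 1 plays $a$, $b$ with probabilities $1 - \varepsilon$, $\varepsilon$, one round moves to $u$ with probability $\varepsilon q$ and leaves $s$ with probability

$$(1 - \varepsilon) q + \varepsilon (1 - q) + \varepsilon q = q (1 - \varepsilon) + \varepsilon \geq \varepsilon .$$

Since $q \leq q(1 - \varepsilon) + \varepsilon$ for $q \in [0, 1]$, we get

$$\Pr(\text{move to } u \text{ in this round}) = \varepsilon q \leq \varepsilon \cdot \Pr(\text{leave } s \text{ in this round}).$$

Summing over rounds gives $\Pr(\Diamond \{u\}) \leq \varepsilon$. Because every round leaves $s$ with probability at least $\varepsilon$, the game leaves $s$ with probability 1, hence $\Pr(\Diamond \{t\}) = 1 - \Pr(\Diamond \{u\}) \geq 1 - \varepsilon$.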

4. BÜCHI AND CO-BÜCHI GAMES

A concurrent Büchi game consists of a concurrent game
structure $\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$ together with a winning
condition $\Box\Diamond U$, where $U \subseteq S$. Intuitively, the winning condition
consists in visiting the subset $U$ of states infinitely
often. The solution of a concurrent Büchi game is given by

$$\langle 1 \rangle \Box\Diamond U = [\![\nu y.\mu x.((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y)))]\!]. \tag{4}$$
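To see how the nested fixpoint in (4) is evaluated, the sketch below iterates it on a small example. Everything in it is an illustrative assumption: the five-state game is hypothetical, it is turn-based (so $\mathrm{Ppre}_1$ reduces to a maximum over moves at player-1 states and a minimum at player-2 states of the expected successor value), and the names `ppre1`, `fixpoint`, and `buchi_value` are introduced here. The outer $\nu y$ starts from 1, and each outer step recomputes the inner $\mu x$ from 0.

```python
from fractions import Fraction as F

# Hypothetical turn-based probabilistic game (not from the paper).
OWNER = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 1}
MOVES = {  # each move is a probability distribution over successor states
    "a": [{"b": F(1, 2), "e": F(1, 2)}, {"d": F(1)}],
    "b": [{"b": F(1)}],               # absorbing, inside U
    "c": [{"c": F(1)}],               # absorbing, outside U
    "d": [{"b": F(1)}, {"c": F(1)}],  # player 2 chooses
    "e": [{"c": F(1)}],               # inside U, but forced to leave for good
}
U = {"b", "e"}


def ppre1(x):
    """Turn-based special case of Ppre_1: max (player 1) or min (player 2)."""
    out = {}
    for s, moves in MOVES.items():
        vals = [sum(p * x[t] for t, p in dist.items()) for dist in moves]
        out[s] = max(vals) if OWNER[s] == 1 else min(vals)
    return out


def fixpoint(step, start, max_iter=1000):
    """Iterate `step` from `start` until it stabilizes (or max_iter is hit)."""
    x = start
    for _ in range(max_iter):
        nxt = step(x)
        if nxt == x:
            return x
        x = nxt
    return x  # only an approximation if the iteration did not stabilize


def buchi_value():
    """Evaluate nu y. mu x. ((not U and Ppre1(x)) or (U and Ppre1(y)))."""
    zero = {s: F(0) for s in MOVES}
    one = {s: F(1) for s in MOVES}

    def outer(y):
        pre_y = ppre1(y)

        def inner(x):
            pre_x = ppre1(x)
            # For a 0/1-valued U the formula selects Ppre1(y) on U, Ppre1(x) elsewhere.
            return {s: pre_y[s] if s in U else pre_x[s] for s in MOVES}

        return fixpoint(inner, zero)  # least fixpoint in x

    return fixpoint(outer, one)       # greatest fixpoint in y
```

In this example the value at `a` is 1/2, whereas `a` reaches $U$ with probability 1: the outer greatest fixpoint discards the visit to `e`, from which $U$ cannot be revisited. The iteration stabilizes after finitely many steps here; in general concurrent games the fixpoints are reached only in the limit, which is why `fixpoint` carries an iteration bound.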

The proof of (4) is based on two lemmas. The first lemma
generalizes the result about concurrent reachability games.
Given a function $g \in \mathcal{F}$ and a subset $U$ of states, we
let $g(\Diamond U)$ be the function that associates with each path
$s_0, s_1, s_2, \ldots$ the value $g(s_i)$, for $i = \min \{k \mid s_k \in U\} < \infty$,
and the value 0 if $s_k \notin U$ for all $k \geq 0$. Hence, $g(\Diamond U)$ is the
value of $g$ at the state where the path first enters $U$, if such
a state exists, and is 0 otherwise. The following lemma can
be proved similarly to Lemma 1.

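Lemma 3. For $g \in \mathcal{F}$ and $U \subseteq S$, let

$$w = [\![\mu x.((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge g))]\!].$$

Then, for all $\varepsilon > 0$ player 1 has a strategy $\pi_1^\varepsilon$ that ensures
$E_s^{\pi_1^\varepsilon,\pi_2}\{g(\Diamond U)\} \geq w(s) - \varepsilon$ at all $s \in S$.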

We call the above game a $g(\Diamond U)$-game; the strategy $\pi_1^\varepsilon$ is an
$\varepsilon$-optimal strategy for it. The following lemma shows that
the fixpoint (4) is a lower bound for the maximal probability
of winning a concurrent Büchi game. The upper-bound
result will follow from results on concurrent co-Büchi games.

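Lemma 4. Let

$$w = [\![\nu y.\mu x.((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y)))]\!].$$

For all $\varepsilon > 0$ player 1 has a strategy $\pi_1^\varepsilon$ such that
$\Pr_s^{\pi_1^\varepsilon,\pi_2}(\Box\Diamond U) \geq w(s) - \varepsilon$ for all $\pi_2 \in \Pi_2$ and all $s \in S$.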

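Proof. From $\varepsilon$, construct a positive sequence $\{\varepsilon_i\}_{i \geq 0}$
with $\sum_{i \geq 0} \varepsilon_i < \varepsilon$. The strategy $\pi_1^\varepsilon$ is as follows. In $S \setminus U$ the
strategy $\pi_1^\varepsilon$ initially coincides with an $\varepsilon_0$-optimal strategy for
the game $w(\Diamond U)$. Upon reaching $U$, the strategy $\pi_1^\varepsilon$ plays
according to an optimal distribution of the matrix game
corresponding to $\mathrm{Ppre}_1(w)$, until $U$ is left. In the following
$\neg U$-phase, $\pi_1^\varepsilon$ coincides with an $\varepsilon_1$-optimal strategy for the
game $w(\Diamond U)$; and so forth. Fix a state $s_0 \in S$ and a strategy
$\pi_2 \in \Pi_2$. Define the process $\{H_n\}_{n > 0}$, where $H_n$ is the
value of $w$ at the $n$-th visit of $U$. From Lemma 3 and from
the construction of $\pi_1^\varepsilon$, we have $E_{s_0}^{\pi_1^\varepsilon,\pi_2}\{H_1\} \geq w(s_0) - \varepsilon_0$,
and for $n > 0$,

$$E_{s_0}^{\pi_1^\varepsilon,\pi_2}\{H_{n+1} \mid H_1, H_2, \ldots, H_n\} \geq H_n - \varepsilon_n.$$

By induction, this leads to

$$E_{s_0}^{\pi_1^\varepsilon,\pi_2}\{H_{n+1}\} \geq w(s_0) - \sum_{i=0}^{n} \varepsilon_i$$

for all $n \geq 0$. Denoting by $[\Box\Diamond]_n U$ the event of visiting $U$
at least $n$ times, we have $\Pr_{s_0}^{\pi_1^\varepsilon,\pi_2}([\Box\Diamond]_n U) \geq E_{s_0}^{\pi_1^\varepsilon,\pi_2}\{H_n\}$.
Combining these two results we obtain

$$\Pr_{s_0}^{\pi_1^\varepsilon,\pi_2}([\Box\Diamond]_n U) \geq w(s_0) - \varepsilon,$$

and the result then follows from

$$\lim_{n \to \infty} \Pr_{s_0}^{\pi_1^\varepsilon,\pi_2}([\Box\Diamond]_n U) = \Pr_{s_0}^{\pi_1^\varepsilon,\pi_2}(\Box\Diamond U).$$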
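One concrete choice makes the bookkeeping in this proof explicit. As an illustration (the specific sequence is an assumption made here; any positive sequence whose sum is below $\varepsilon$ works), take $\varepsilon_i = \varepsilon / 2^{i+2}$, so that

$$\sum_{i \geq 0} \varepsilon_i = \frac{\varepsilon}{4} \sum_{i \geq 0} 2^{-i} = \frac{\varepsilon}{2} < \varepsilon .$$

Writing $E$ for the expectation under the fixed strategies from $s_0$, the induction step is an application of the tower property of conditional expectation:

$$
\begin{aligned}
E\{H_{n+1}\} &= E\bigl\{ E\{H_{n+1} \mid H_1, \ldots, H_n\} \bigr\}
  \geq E\{H_n\} - \varepsilon_n \\
  &\geq w(s_0) - \sum_{i=0}^{n-1} \varepsilon_i - \varepsilon_n
  = w(s_0) - \sum_{i=0}^{n} \varepsilon_i
  > w(s_0) - \varepsilon .
\end{aligned}
$$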


A concurrent co-Büchi game consists of a concurrent game
structure $\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$ together with a winning
condition $\Diamond\Box U$, where $U \subseteq S$. Intuitively, the winning condition
consists in eventually staying forever in the subset $U$
of states. The solution of a concurrent co-Büchi game is
given by

$$\langle 1 \rangle \Diamond\Box U = [\![\mu x.\nu y.((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y)))]\!]. \tag{5}$$

Again, the proof of the above fixpoint equation is based on
two lemmas. Let $[\Box U]$ be the function that associates with
each path value 1 if the path always stays in $U$, and value 0
otherwise. The first lemma generalizes Lemma 2.

Lemma 5. For $g \in \mathcal{F}$ and $U \subseteq S$, let

$$w = [\![\nu y.((U \wedge \mathrm{Ppre}_1(y)) \vee (\neg U \wedge g))]\!].$$

Then the strategy $\pi_1$ of player 1 that plays at each
$s \in S$ according to an optimal distribution of the
matrix game corresponding to $\mathrm{Ppre}_1(w)(s)$ is such that
$E_s^{\pi_1,\pi_2}\{[\Box U] + g(\Diamond \neg U)\} \geq w(s)$ for all $s \in S$ and $\pi_2 \in \Pi_2$.

The proof is similar to that of Lemma 2. The following
lemma shows that the fixpoint of (5) is a lower bound for
the maximal probability of winning the concurrent co-Büchi
game.

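Lemma 6. Let

$$w = [\![\mu x.\nu y.((\neg U \wedge \mathrm{Ppre}_1(x)) \vee (U \wedge \mathrm{Ppre}_1(y)))]\!].$$

For all $\varepsilon > 0$ player 1 has a strategy $\pi_1^\varepsilon$ such that
$\Pr_s^{\pi_1^\varepsilon,\pi_2}(\Diamond\Box U) \geq w(s) - \varepsilon$ for all $\pi_2 \in \Pi_2$ and all $s \in S$.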

Proof. Denote by $[\Diamond\Box U]_n$ the event of visiting $\neg U$ at
most $n$ times. Let $x_0 = 0$, and for $n > 0$,

$$x_n = [\![\nu y.((\neg U \wedge \mathrm{Ppre}_1(x_{n-1})) \vee (U \wedge \mathrm{Ppre}_1(y)))]\!].$$

By induction on $n \geq 0$, we show that player 1 has a strategy
$\pi_1^n$ such that $\Pr_s^{\pi_1^n,\pi_2}([\Diamond\Box U]_n) \geq x_n(s)$ for all $s \in S$ and
all $\pi_2 \in \Pi_2$; the result will then follow by taking the limit
$n \to \infty$. The base case is trivial. For $n > 0$, the strategy
$\pi_1^n$ plays according to an optimal distribution of the matrix
game corresponding to $\mathrm{Ppre}_1(x_n)$ as long as $U$ is not left.
At the first visit to $\neg U$, the strategy $\pi_1^n$ plays one round
according to an optimal distribution of the matrix game
corresponding to $\mathrm{Ppre}_1(x_{n-1})$, and switches thereafter to
the strategy $\pi_1^{n-1}$. By definition of $\pi_1^n$, from the previous
lemma we have

$$\Pr_s^{\pi_1^n,\pi_2}([\Diamond\Box U]_n) \geq \Pr_s^{\pi_1^n,\pi_2}(\Box U) + E_s^{\pi_1^n,\pi_2}\{x_{n-1}(\Diamond \neg U)\} \geq x_n(s).$$


The following theorem summarizes the results about
concurrent Büchi and co-Büchi games.

Theorem  2.  The  following  assertions  hold. 

1. Concurrent Büchi and co-Büchi games can be solved
according to (4) and (5).

2. There are deterministic concurrent Büchi games without
optimal strategies, and without finite-memory
$\varepsilon$-optimal strategies. Turn-based Büchi games always
have deterministic and memoryless optimal strategies.

3. There are deterministic concurrent co-Büchi games
without optimal strategies. Turn-based co-Büchi games
always have deterministic and memoryless optimal
strategies.

Part 1 follows from Lemmas 4 and 6, and from quantitative
game $\mu$-calculus complementation. Part 2 follows from the
lack of optimal strategies for reachability (see Example 2),
and from the fact that Büchi games are equivalent to iterated
reachability games (see [6] for an example). Part 3 is a
consequence of the lack of optimal strategies for concurrent
reachability games.

5.  RABIN-CHAIN  GAMES 

A concurrent Rabin-chain game consists of a concurrent
game structure $\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$ together with a
winning condition

$$\mathcal{R} = \bigvee_{i=0}^{k-1} \left( \Box\Diamond U_{2i} \wedge \neg \Box\Diamond U_{2i+1} \right),$$

where $k > 0$ and $\emptyset = U_{2k} \subseteq U_{2k-1} \subseteq U_{2k-2} \subseteq \cdots \subseteq U_0 = S$.
A more intuitive characterization of this winning condition
can be obtained by defining, for $0 \leq i \leq 2k - 1$, the set $C_i$
of states of color $i$ by $C_i = U_i \setminus U_{i+1}$. The total number of
colors is $N = 2k$. Given a path $s$, let $\mathit{Infi}(s) \subseteq S$ be the set
of states that occur infinitely often along $s$, and let

$$\mathit{MaxCol}(s) = \max \{ i \in \{0, \ldots, N-1\} \mid C_i \cap \mathit{Infi}(s) \neq \emptyset \}$$

be the largest color appearing infinitely often along the path. Then,

$$\mathcal{R} = \{ s \in \Omega \mid \mathit{MaxCol}(s) \text{ is even} \}.$$
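The equivalence between the chain form of the condition and the parity of the largest color seen infinitely often can be checked mechanically on eventually periodic ("lasso") paths, where the states visited infinitely often are exactly those on the loop. The sketch below is illustrative: the function names and the four-state chain are assumptions made here, not objects from the paper.

```python
def colors_from_chain(chain):
    """Split a decreasing chain U_0, U_1, ..., U_2k into colors C_i = U_i minus U_{i+1}."""
    return [chain[i] - chain[i + 1] for i in range(len(chain) - 1)]


def max_col(colors, cycle):
    """Largest color occurring infinitely often on a lasso path with loop `cycle`."""
    infi = set(cycle)  # states visited infinitely often
    return max(i for i, c in enumerate(colors) if c & infi)


def wins_by_color(colors, cycle):
    """Parity form: the path is winning iff MaxCol is even."""
    return max_col(colors, cycle) % 2 == 0


def wins_by_chain(chain, cycle):
    """Chain form: some i has U_{2i} visited infinitely often and U_{2i+1} not."""
    infi = set(cycle)
    k = (len(chain) - 1) // 2
    return any(
        bool(chain[2 * i] & infi) and not (chain[2 * i + 1] & infi)
        for i in range(k)
    )


# Hypothetical chain with k = 2 over S = {0, 1, 2, 3}: state i has color i.
CHAIN = [{0, 1, 2, 3}, {1, 2, 3}, {2, 3}, {3}, set()]
COLORS = colors_from_chain(CHAIN)
```

For instance, a path looping through states 0 and 2 has `max_col(COLORS, [0, 2]) == 2` and is winning under both forms, while a loop through 2 and 3 has largest color 3 and is losing under both.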

The solution $\langle 1 \rangle \mathcal{R}$ for a Rabin-chain condition with $N$ colors
is given by

$$\langle 1 \rangle \mathcal{R} = [\![ \lambda_{N-1} x_{N-1}. \cdots \mu x_1. \nu x_0. \bigvee_{i=0}^{N-1} (C_i \wedge \mathrm{Ppre}_1(x_i)) ]\!], \tag{6}$$

where $\lambda_n = \nu$ if $n$ is even, and $\lambda_n = \mu$ if $n$ is odd (compare
with [8]). The proof of (6) is based on the following
inductive decomposition, inspired by the one of [8].


We denote by $C_{\leq n} = \bigcup_{i=0}^{n} C_i$ (resp. $C_{>n} = \bigcup_{i=n+1}^{N-1} C_i$ and
$C_{<n} = \bigcup_{i=0}^{n-1} C_i$) the set of states colored by colors less than
or equal to $n$ (resp., greater than $n$, and smaller than $n$).
Let $z \in \mathcal{F}$, and for $n \geq 0$ define $J_n$ by $J_{-1}(z) = z$, and

$$J_n(z) = \lambda_n x. J_{n-1}((C_n \wedge \mathrm{Ppre}_1(x)) \vee (C_{>n} \wedge z)).$$

We can show by induction on $n$ that $[\![J_n(z)]\!]$ is the function
that gives the maximal expectation of either winning the
concurrent Rabin-chain game while visiting only states in
$C_{\leq n}$, or of the value $z(\Diamond C_{>n})$ if $C_{\leq n}$ is exited. Denote
by $[\mathcal{R} \wedge \Box C_{\leq n}]$ the random function that has value 1 over
a path exactly when the path satisfies condition $\mathcal{R}$ while
visiting only states in $C_{\leq n}$. The lemma below makes this
statement precise.

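Lemma 7. For all $\varepsilon > 0$, all $z \in \mathcal{F}$, and all states $s \in S$,
there is a strategy $\pi_1 \in \Pi_1$ for player 1 such that for all
strategies $\pi_2 \in \Pi_2$ of player 2, we have

$$E_s^{\pi_1,\pi_2}\{[\mathcal{R} \wedge \Box C_{\leq n}] + z(\Diamond C_{>n})\} \geq [\![J_n(z)]\!](s) - \varepsilon.$$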

The proof of the lemma is similar to the proof of the lemmas
for the Büchi and co-Büchi conditions; we sketch the
inductive step for $n$ odd (i.e., $\lambda_n = \mu$). From $\varepsilon$, construct a
positive sequence $\{\varepsilon_i\}_{i \geq 0}$ with sum less than $\varepsilon$. Let $x_0 = 0$,
and for $k > 0$, let

$$x_k = [\![J_{n-1}((C_n \wedge \mathrm{Ppre}_1(x_{k-1})) \vee (C_{>n} \wedge z))]\!].$$

By induction on $k$, we show that player 1 has a strategy $\pi_1^k$
such that

$$E_s^{\pi_1^k,\pi_2}\{[\mathcal{R} \wedge \Box C_{\leq n}] + z(\Diamond C_{>n})\} \geq x_k(s) - \sum_{i=0}^{k} \varepsilon_i$$

for all $s \in S$ and $\pi_2 \in \Pi_2$. The strategy $\pi_1^k$ for player 1
coincides with an $\varepsilon_k$-optimal strategy in the game
$J_{n-1}((C_n \wedge \mathrm{Ppre}_1(x_{k-1})) \vee (C_{>n} \wedge z))$ while the game remains
in $C_{<n}$; when $C_n$ is hit for the first time, it plays an optimal
strategy in the matrix game $\mathrm{Ppre}_1(x_{k-1})$, and thereafter
switches to the inductively constructed strategy $\pi_1^{k-1}$. Then

$$
\begin{aligned}
E_s^{\pi_1^k,\pi_2}\{[\mathcal{R} \wedge \Box C_{\leq n}] + z(\Diamond C_{>n})\}
  &\geq [\![J_{n-1}((C_n \wedge \mathrm{Ppre}_1(x_{k-1})) \vee (C_{>n} \wedge z))]\!](s) - \sum_{i=0}^{k} \varepsilon_i \\
  &= x_k(s) - \sum_{i=0}^{k} \varepsilon_i,
\end{aligned}
$$

for all $s \in S$ and $\pi_2 \in \Pi_2$, using the induction hypothesis
on $x_{k-1}$, and the claim follows by taking $k \to \infty$. A similar
argument works for $n$ even (i.e., $\lambda_n = \nu$). The value of the
game with condition $\mathcal{R}$ is then $[\![J_{N-1}(0)]\!]$. Both the lower
and the upper bounds for the value of the game follow from
the lemma, because Rabin-chain games are self-dual (the
complement of a concurrent Rabin-chain game is again a
concurrent Rabin-chain game). We can now summarize the
results for concurrent Rabin-chain games.

Theorem  3.  The  following  assertions  hold. 

1. Concurrent Rabin-chain games can be solved according
to (6).

2. There are deterministic concurrent Rabin-chain games
without optimal strategies and without finite-memory
$\varepsilon$-optimal strategies. Turn-based Rabin-chain games
always have deterministic and memoryless optimal
strategies.

Figure 1: A turn-based game that disproves the reduction to reachability. A label $(a, b)$ of an edge (or of a
probabilistic bundle of edges) indicates that the edge is followed when player 1 chooses move $a$ and player 2
chooses move $b$.


Finally, the next theorem states that if the state space
is countable, rather than finite, the quantitative game
$\mu$-calculus solutions presented in this paper still define the
value of the game.

Theorem 4. Consider a concurrent game structure
$\mathcal{G} = \langle S, \mathit{Moves}, \Gamma_1, \Gamma_2, p \rangle$, where $S$ is countable. Then, formulas
(2), (3), (4), (5), and (6) provide the solutions for concurrent
reachability, safety, Büchi, co-Büchi, and Rabin-chain
games, respectively.

This  theorem  can  be  proved  by  the  same  arguments  used  for 
finite  concurrent  games,  using  transfinite  induction  rather 
than  ordinary  induction  when  arguing  about  the  least  and 
greatest  fixpoints  of  the  calculus. 

A  comparison  with  Markov  decision  processes 

A Markov decision process is a concurrent game structure
where $|\Gamma_2(s)| = 1$ for all $s \in S$. In a Markov decision process,
the problem of computing the maximal probability of satisfying
a Büchi, co-Büchi, or Rabin-chain condition $\Psi$ can be
solved in polynomial time, by reducing it to the problem of
computing a maximal reachability probability [5]. From $\Psi$,
we can first compute the subset $T_\Psi = \{s \in S \mid \langle 1 \rangle \Psi(s) = 1\}$
of states where the maximal probability of $\Psi$ is 1. Then,
we have $\langle 1 \rangle \Psi = \langle 1 \rangle \Diamond T_\Psi$, indicating that the maximal
probability of satisfying $\Psi$ is equal to the maximal probability
of reaching $T_\Psi$. In concurrent games, given a Büchi,
co-Büchi, or Rabin-chain condition $\Psi$, we can compute the set
$T_\Psi$ with the algorithms of [6], setting $T_\Psi = \langle\langle 1 \rangle\rangle_{\mathit{limit}} \Psi$. If
the equality $\langle 1 \rangle \Psi = \langle 1 \rangle \Diamond T_\Psi$ held for concurrent games, it
would provide monotonic approximation schemes for computing
the value of the game (the problem would still not
be reducible to linear programming, since the values may
be irrational, as mentioned earlier). However, the following
example demonstrates that the equality does not hold for
games.

Example 3. Consider the turn-based game depicted in
Figure 1. Let $U = \{t_1, t_2, t_4\}$, and consider the co-Büchi
winning condition $\Diamond\Box U$. The sets of states $R_1$ (resp. $R_2$)
where player 1 (resp. 2) can ensure winning (resp. losing)
with probability 1 are given by

$$R_1 = T_{\Diamond\Box U} = \{s \in S \mid \langle 1 \rangle \Diamond\Box U(s) = 1\} = \{t_1\},$$

$$R_2 = \{s \in S \mid \langle 2 \rangle \Box\Diamond \neg U(s) = 1\} = \{t_4, t_5\}.$$

For $i \in \{1, 2\}$, the maximal probability for player $i$ of reaching
$R_i$ from outside $R_i$ is zero: $\langle 1 \rangle \Diamond R_1(t_k) = 0$ for $k \neq 1$,
and $\langle 2 \rangle \Diamond R_2(t_k) = 0$ for $k \notin \{4, 5\}$. Nevertheless, we can
verify that $\langle 1 \rangle \Diamond\Box U(t_2) = 2/3$, and $\langle 1 \rangle \Diamond\Box U(t_3) = 1/3$. ■